home *** CD-ROM | disk | FTP | other *** search
- -*- Mode: Text -*-
-
- Copyright (c) 1985, 1986, 1987, 1988, 1989 Xerox Corporation.
- All rights reserved.
-
- Use and copying of this document is permitted. Any distribution of this
- document must comply with all applicable United States export control
- laws.
-
- Last updated: 6/3/89 by Gregor
- 10/26/89 by Gregor -- added :RETURN, removed :ISHIFT
-
- This file contains documentation of the PCL abstract LAP code. Any port
- of PCL is required to implement the abstract LAP code interface. There
- is a portable, relatively good performance implementation in the file
- lap.lisp, port-specific implementations are in that file as well.
-
- The PCL abstract LAP code mechanism exists to provide PCL with a way to
- create high-performance method lookup functions. Using this mechanism,
- PCL can produce "LAP closures" which do the method lookup. By allowing
- PCL to specify these closures using abstract LAP code rather that Lisp
- code we hope to achieve the following:
-
- * Better runtime performance. By using abstract LAP code, we
- will get better machine instruction sequences than we would
- from compiling Lisp code.
-
- * Better load and update time performance. Because it should
- be possible to "assemble" the LAP code more quickly than
- compiling Lisp code, PCL will spend less time building the
- method lookup code.
-
- * Ability to use PCL without a compiler. The LAP assembler will
- still be required but this should be much smaller than the full
- lisp compiler.
-
- Of course, not all implementations of the LAP code mechanism will
- satisfy all of these goals. The first is the most important.
-
- In particular, many PCL ports will use the portable LAP implementation.
- KCL will use the portable implementation in all of its ports. Other
- Lisps may have custom LAP implementations for some ports and use the
- portable implementation for other ports.
-
- Some Lisps will have a custom LAP implementation but will nonetheless
- require the compiler to be loaded to generate LAP closure constructors.
-
- An important point is why we have chosen to take this route rather than
- have each implementation implement the method lookup codes itself. This
- was done because we are, at PARC, just beginning to study cache behavior
- for CLOS programs. As we learn more about this we will want to modify
- the caching strategy PCL uses. This architecture, because it leaves
- PCL to implement caching behavior makes it possible to do this. Once
- this study is complete, implementations may want to do their own, ultra
- high performance implementations of caching strategies.
-
-
-
- Production of LAP closures is a two step process. In the first step, a
- port-specific function is called to take abstract LAP code and produce a
- a "lap closure generator". Lap closure generators are functions which
- are called with a set of closure variable values and return a LAP
- closure.
-
- The intermediary of the lap closure generators provides an important
- optimization. Because it is assumed that producing the LAP closure
- generator can take much longer than producing a LAP closure from the
- generator, PCL attempts to make only one closure generator for each
- sequence of LAP code. Because of the way PCL generates the LAP code
- sequences, this is quite easy for it to do.
-
- The rest of this document is divided into six parts.
-
- * the metatypes std-instance and fsc-instance
-
- * an abstraction for simple vector indices
-
- * important optimizations
-
- * the port specific function for making lap closure generators
-
- * the actual abstract LAP code
-
- * examples
-
- *** The metatypes STD-INSTANCE and FSC-INSTANCE ***
-
- In PCL, instances with metaclass STANDARD-CLASS are represented using
- the metatype STD-INSTANCE. (Note that in Cinco de Mayo PCL, this
- metatype is called IWMC-CLASS.) Each port must implement this metatype.
- The metatype could be implemented by the following DEFSTRUCT.
-
- (defstruct (std-instance
- (:predicate std-instance-p)
- (:conc-name %std-instance-)
- (:constructor %allocate-std-instance (wrapper slots))
- (:constructor %allocate-std-instance-1 ())
- (:print-function print-std-instance))
- (wrapper nil)
- (slots nil))
-
- PCL itself will guarantee correct access to this structure and the
- accessors and constructors. With this in mind, the following are
- important.
-
- * Being able to type test this structure quickly is critical. See
- the :STD-INSTANCE-P opcode.
-
- * The allocation functions should compile inline, do no argument
- checking and be as fast as possible.
-
- * The accessor functions should compile inline, do no checking of their
- arguments and be as fast as possible. SETF of the accessors should
- do the same.
-
- The port is also required to implement the metatype FSC-INSTANCE (called
- FUNCALLABLE-INSTANCE, or FIN for short, in Cinco de Mayo PCL). Objects
- with this metatype are used, among other things, to implement generic
- functions. These objects have field structure associated with them and
- are also functions that can be applied to arguments. The fields are the
- same as those for STD-INSTANCE, the FSC-INSTANCE metatype has
- predicates, print-functions, constructors and accessors as follows:
-
- fsc-instance-p
- print-fsc-instance
- %fsc-instance-wrapper
- %fsc-instance-slots
- %allocate-fsc-instance (wrapper slots)
- %allocate-fsc-instance-1 ()
-
- In addition, objects of metatype FSC-INSTANCE have a property called the
- funcallable instance function. When an FSC-INSTANCE is applied to
- arguments, the funcallable instance function is what is actually called.
- The funcallable instance function of an FSC-INSTANCE can be changed
- using the function SET-FUNCALLABLE-INSTANCE-FUNCTION. There is no
- mechanism for obtaining the funcallable instance function of an
- FSC-INSTANCE.
-
- It is possible to implement the FSC-INSTANCE metatype in pure Common
- Lisp. A simple implementation which uses lexical closures as the
- instances and a hash table to record that the lexical closures are of
- metatype FSC-INSTANCE is easy to write. Unfortunately, this
- implementation adds significant overhead:
-
- to generic-function-invocation (1 function call)
- to slot-access (1 function call or one hash table lookup)
- to class-of a generic-function (1 hash-table lookup)
-
- In addition, it would prevent the FSC-INSTANCEs from being garbage
- collected. In short, the pure Common Lisp implementation really isn't
- practical.
-
- Note that previous implementations of FINS were always based on the
- lexical closure metatype. In some ports, that provides poor
- performance. Those ports may want to consider reimplementing to use the
- compiled code metatype. In that implementation strategy, LAP closure
- variables would become constants of the compiled code object.
-
- The following note from JonL is of interest when working on a FIN
- implementation:
-
- Date: Tue, 16 May 89 05:45:56 PDT
- From: Jon L White <jonl@lucid.com>
-
- This isn't a bug in Lucid's compiler -- it's a lurking bug in PCL
- that will "bite" most implementations where different settings of
- the compiler optimization switches will produce morphologically
- different (but of course functionally equivalent) function objects.
-
- The difficulty is in how discriminator codes service cache misses.
- They "call out" to (potentially) random functions that will in some
- cases "smash" the function object that was actually running as the
- discriminator code. This is all right providing you don't return to
- that function frame, but alas ...
-
- I know this is a more extensive problem because the code in the
- port-independent function 'notice-methods-change' goes out of
- its way to do a tail-recursive call to the function that is going
- to smash the possibly-executing discriminator code. Here is the
- commentary from that code (sic):
-
- ;; In order to prevent this we take a simple measure: we just
- ;; make sure that it doesn't try to reference our its own closure
- ;; variables after it makes the dcode change. This is done by
- ;; having notice-methods-change-2 do the work of making the change
- ;; AND calling the actual generic function (a closure variable)
- ;; over. This means that at the time the dcode change is made,
- ;; there is a pointer to the generic function on the stack where
- ;; it won't be affected by the change to the closure variables.
-
-
- A similar thing should be done in the construction of standard-accessor,
- checking, and caching dcodes. In an experimental version here at Lucid,
- I rewrote dcode.lisp to do that, and there is no problem with it.
- Although that code is somewhat Lucid-specific, it could be of help to
- someone who wanted to rewrite the generic dcode.lisp (no pun intended).
- Contact me privately if you are interested.
-
- Doing a tail-recursive call out of dcodes when there is a cache miss
- is a good thing, regardless of other problems. I think one might as
- well do it. However, I should point out that in the presence of
- multiprocessing, there is another more serious problem that cannot be
- solved so simply. Think about what happens when one process decides
- to update a dcode while another process is still using it; no such
- stack-maintenance discipline will fix this case. A tail-recursive
- exit from the dcode will *immensely* reduce the likelihood that
- another process can sneak in during the interval in which the dcode
- requires consistency in its function; but it can't reduce that
- likelihood to zero.
-
- The more desirable thing to do is to put the whole "dcode" down one
- more level of indirection through the symbol-function cell of the
- generic function. This is effectively what PCL's 'make-trampoline'
- function does, but unfortunately that is not a very efficient approach
- when you consider how most compilers will compile it. Something akin
- to the "mattress-pads" in Steve Haflich's code (in the fin.lisp file)
- could probably be done for many other implementations as well.
-
-
- *** Index Operations ***
-
- Indexes are an abstraction for indexes into a simple vector. This
- abstraction is used to make it possible to generate more efficient
- code to access simple vectors. The idea being that this may make it
- possible to use alternate addressing modes to address these.
-
- The "index value" of an index is defined to be the fixnum of which that
- index is an alternate form. So, using the Lisp function SVREF with the
- index value of an index accesses the same element as using the index
- with the appropriate access function or operand.
-
- The format of an index is unspecified, but is assumed to be something
- like a fixnum with certain bits ignored. Accessing a vector using an
- index must be done using the appropriate special accessor function or
- opcode.
-
- Conversion from index values to indices and vice-versa can be done with
- the following functions:
-
- INDEX-VALUE->INDEX (index-value)
- INDEX->INDEX-VALUE (index)
-
-
- The following constant indicates the maximum index value an index can
- have in a given port. This must be at least 2^16.
-
- INDEX-VALUE-LIMIT - a fixnum, must be at least 2^16.
-
-
- MAKE-INDEX-MASK (<cache-size> <line-size>)
-
- This function is used to make index masks. Because I am lazy, I show an
- implementation of it in the common case where indexes are just fixnums:
-
- (defun make-index-mask (cache-size line-size)
- (let ((cache-size-in-bits (floor (log cache-size 2)))
- (line-size-in-bits (floor (log line-size 2)))
- (mask 0))
- (dotimes (i cache-size-in-bits) (setq mask (dpb 1 (byte 1 i) mask)))
- (dotimes (i line-size-in-bits) (setq mask (dpb 0 (byte 1 i) mask)))
- mask))
-
- *** Optimizations ***
-
- This section discusses two important optimizations related to LAP
- closures. The first relates to calling LAP closures themselves, the
- second relates to calling other functions from LAP closures.
-
- The important point about calling LAP closures is that almost all of the
- time, LAP closures will be used as the funcallable-instance-function of
- funcallable instances. It is required that LAP closures be funcallable
- themselves, but usually they will be stored in a FIN and the fin will
- then be funcalled. This brings up several optimizations, including ones
- having to do with access to the closure variables of a LAP closure.
-
- When a LAP closure is used to do method lookup, the function the LAP
- closure ends up calling has the same number of required arguments as the
- LAP closure itself. Since the LAP closure must check its required
- arguments to do the lookup, it is redundant for the function called to
- do so as well. Since LAP closures do all calls in a tail recursive way,
- it should even be possible to optimize out certain parts of the normal
- stack frame initialization.
-
- A similar situation occurs between effective method functions and the
- individual method functions; the difference is that in effective method
- functions, the calls are not necessarily tail recursive.
-
- Consequently, it would be nice to have a way to call certain functions
- and inhibit the checking of required arguments. This is made possible
- by use of the PCL-FAST-APPLY and PCL-FAST-FUNCALL macros together with
- the PCL-FAST-CALL compiler declaration.
-
- The PCL-FAST-CALL compiler declaration declares that a function may be
- fast called. Not all callers of the function will necessarily fast call
- it, but most probably will. The :JMP opcode can only be used to call a
- function compiled with the PCL-FAST-CALL declaration.
-
- The PCL-FAST-APPLY and PCL-FAST-FUNCALL macros are used to fast call a
- function. The function argument must be a compiled function that has
- the PCL-FAST-CALL compiler declaration in its lambda declarations.
-
- The basic idea is that the PCL-FAST-CALL compiler declaration causes the
- compiler to set up an additional entrypoint to the function. This
- entrypoint comes after checking of required arguments but before
- processing of other arguments.
-
- Note: When FAST-APPLY is used, the required arguments will be given as
- separate arguments and all other arguments will appear as a single
- spread argument. For example:
-
- (let ((fn (compile () '(lambda (a b &optional (c 'z))
- (declare (pcl-fast-call))
- (list a b c)))))
-
- (pcl-fast-apply fn 'x 'y ()) ;legal
- (pcl-fast-apply fn 'x 'y '(foo)) ;legal
- (pcl-fast-apply fn '(a b c)) ;illegal
- )
-
- *** Producing LAP Closure Generators ***
-
- Each implementation of the LAP code mechanism must provide a port
- specific function making lap closure generators. In the portable
- implementation, this function is called PLAP-CLOSURE-GENERATOR. In
- ExCL it should be called EXCL-LAP-CLOSURE-GENERATOR etc.
-
- At any time, the value of the variable *make-lap-closure-generator* is a
- symbol which names the function currently being used to make lap closure
- generators.
-
- The port specific function must accept arguments as follows:
-
- PLAP-CLOSURE-GENERATOR (<closure-vars>
- <args>
- <index-registers>
- <simple-vector-registers>
- <lap-code>)
-
- This returns a lap-closure generator. A lap-closure generator is a
- function which is called with a number of arguments equal to the length
- of <closure-vars>. These arguments are the values of the closure
- variables for the lap closure. These values cannot be changed once the
- LAP closure is created. PCL takes care of keeping track of
- lap-closure-generators it already has on hand and reusing them. The
- function RESET-LAP-CLOSURE-GENERATORS can be called to force PCL to
- forget all the lap closure generators it has remembered.
-
- <args>
-
- A list of symbols. This provides a way to name particular arguments to
- the LAP closure. Arguments which will not be referenced by name are
- given as NIL. All required arguments to the LAP closure are explicitly
- included (perhaps as NIL). If &REST appears at the end of arguments it
- means that non-required arguments are allowed, these will be processed
- by the methods. If &REST does not appear at the end of arguments, the
- lap closure should signal an error if more than the indicated number of
- arguments are supplied.
-
- Examples:
-
- - (obj-0 obj-1)
-
- Specifies a two argument lap closure. If more or less than
- two arguments are supplied an error is signalled. Within
- the actual lap code, both arguments can be referenced by
- name (see the :ARG operand).
-
- - (obj-0 nil &rest)
-
- Specifies a two or more argument lap closure. If less than
- two arguments are supplied an error is signalled. Within
- the actual lap code, the first argument can be referenced by
- name (see the :ARG operand).
-
-
- <closure-vars>
-
- A list of symbols. The closure will have these as closure variables.
- Within the lap code these can be accessed using the :CVAR operand. The
- lap code cannot change these values. SET-FUNCALLABLE-INSTANCE-FUNCTION
- is permitted to have the special knowledge that there are at most ?? of
- these and to be prepared to do something special when the funcallable
- instance function of a funcallable instance is set to a lap closure.
-
- <index-registers>
-
- A list of register numbers. These registers will be used only to hold
- indexes. Other registers may be used to hold indexes as well, but the
- only values put into these registers will be indexes.
-
- <simple-vector-registers>
-
- A list of register numbers. These registers will be used only to hold
- simple-vectors. Other registers may be used to hold simple-vectors as
- well, but the only values put into these registers will be
- simple-vectors.
-
-
- <lap-code>
-
- The actual lap code for this closure. This is a list of LAP code
- opcodes. See the section "Abstract LAP Code" for more details.
-
- Each implementation must also supply a function named PRE-MAKE-xxx where
- xxx is the same as the name of its make-lap-closure-generator function.
- The macro doesn't evaluate its arguments, and when it appears in a file
- it should try to do some of the work at load time. It might appear in a
- file like this:
-
- (eval-when (load)
- (setq 1-arg-std-lap
- (pre-make-plap-closure-generator ...)))
-
- *** Abstract LAP Code ***
-
- Each lap code operand has the form: (opcode operand1 ... operandn).
-
- In some cases, the distinction between an operand and an opcode is
- somewhat arbitrary. In general, opcodes have a significant "action"
- component to their behavior. Operands select a piece of data to operate
- on. Some operands select their data in a more complex way, but they are
- operands anyways.
-
- All data must be in a register before it can be operated on. This
- requirement means that the only place a non-register operand can appear
- is as the first argument to the :move opcode. (Actually, there is one
- other exception, a :iref operand can be the target of a move as well.)
- Moreover, only register operands can appear as the second argument to
- the :move opcode and this register must not appear in the <from>
- operand.
-
- >> The operands are:
-
- (:reg <n>)
-
- A pseudo register. <n> is an integer in the range [0 , 31].
-
- A particular implementation can map this to a real register, a memory
- location or the stack. The abstract LAP code itself does not include
- the notion of a stack.
-
- PCL will attempt to optimize register use in two ways. PCL itself will
- attempt to re-use registers whenever possible. That is, the port should
- not have to worry with doing live register analysis for the registers.
- In addition, PCL will consider lower numbered registers to be "faster"
- than higher numbered ones.
-
-
- (:cvar <name>)
-
- A closure variable of the lap-closure. <name> is a symbol.
-
-
- (:arg <name>)
-
- An argument to the LAP closure. <name> is a symbol.
-
- (:std-wrapper <from>)
- (:fsc-wrapper <from>)
- (:built-in-wrapper <from>)
- (:structure-wrapper <from>)
- (:other-wrapper <from>)
-
- Get the class wrapper of <from>. For std-instances and fsc-instances
- this just fetches the wrapper field. The specific port is required to
- implement fast access to the wrappers of built-in, structure and other
- metatypes. A callback mechanism allows the port to ask PCL to generate
- a class and wrapper for objects for which no class and wrapper exists
- yet. This mechanism is <<to be spec'd out>>.
-
-
- (:std-slots <operand>)
- (:fsc-slots <operand>)
-
- Fetch the slots field of a std-instance or a fsc-instance.
-
- (:constant <constant>)
-
- This just allows inline constants. <constant> can be any Lisp object.
-
- The following operands operate on indexes. Each is patterned after a
- Lisp function which would have a corresponding effect on the index value
- of the index.
-
- (:i1+ <index>)
- (:i+ <index-1> <index-2>)
- (:i- <index-1> <index-2>)
- (:ilogand <index-1> <index-2>)
- (:ilogxor <index-1> <index-2>)
-
- Like the corresponding Lisp functions.
-
-
- (:iref <vector> <index>)
-
- Like the SVREF function. <vector> must be a simple vector.
-
- (:cref <vector> <constant>)
-
- The :cref operand is for constant vector references. <constant> must be
- a fixnum.
-
- >> The opcodes are:
-
- (:move <from> <to>)
-
- A full word move operation.
-
-
- (:eq <from1> <from2> <label>)
- (:neq <from1> <from2> <label>)
-
- EQ and NOT EQ conditional branches. If the value contained in <from1>
- is EQ (or not) to the value contained in <from2>, jump to <label>.
- Otherwise continue execution at the next opcode. <label> is a symbol.
-
- (:fix= <from1> <from2> <label>)
-
- A fixnum = conditional branch.
-
- (:izerop <from> <label>)
-
- Jump to label if and only if the index <from> is 0.
-
- (:std-instance-p <from> <destination-label>)
- (:fsc-instance-p <from> <destination-label>)
- (:built-in-instance-p <from> <destination-label>)
- (:structure-instance-p <from> <destination-label>)
-
- Test the metatype of <from> and dispatch. If the metatype of from is of
- the specified type execution jumps to <destination-label>. Otherwise,
- execution proceeds normally at the next opcode. See the :xxx-wrapper
- operands.
-
- (:jmp <function>)
-
- fcn contains a function to call. This must be a compiled function,
- which had the PCL-FAST-CALL declaration in it. The call should be "tail
- recursive" in that whatever values it returns should be returned
- immediately from the LAP closure itself.
-
- Method lookup is implemented by finding a function in the cache and then
- using :JMP to call it. The various kinds of traps are implemented by
- using :JMP to call a trap function that was stored in one of the closure
- variables.
-
- (:return <value>)
-
- Return immediately from the LAP closure. <value> is the single value to
- return.
-
- In certain cases of method lookup when all the methods are accessor methods,
- there is an important optimization which implements the slot access in the
- discriminating function itself. This opcode is used in that case.
-
- (:label <label>)
-
- Identifies a point in the lap code. <label> is a symbol. This can be
- the target of any of the control transfer opcodes (:GO, :EQ, :NEQ,
- :FIX=, :STD-INSTANCE-P, :FSC-INSTANCE-P, :STRUCTURE-INSTANCE-P,
- :BUILT-IN-INSTANCE-P)
-
- (:go <label>)
-
- Jump to the label <label>. <label> is a symbol.
-
- *** Examples ***
-
- Here is an example of the use of the abstract LAP mechanism. It doesn't
- use all operands or opcodes, but it is representative of the LAP
- sequences PCL will use.
-
- Several things are worth noting:
- * This is, I believe, just about the simplest such sequence. There
- are some others of comparable simplicity, but none much simpler.
-
- * A total of only 5 registers are used. I haven't checked, but I
- am pretty sure most all such code sequences will get by with 16
- or less and many will stay under 8.
-
- (defun make-1-arg-std-dfun ()
- (let ((cg
- (making-lap-closure-generator
- (initialize-lap-cvars '(cache mask reflect trap))
- (initialize-lap-args '(a0))
- (initialize-lap-registers 4 4 3)
- (let ((cache (allocate-lap-register 'simple-vector))
- (count (allocate-lap-register))
- (wrapper (allocate-lap-register))
- (index (allocate-lap-register 'index))
- (t1 (allocate-lap-register))
- (t2 (allocate-lap-register))
- (i1 (allocate-lap-register 'index))
- (i2 (allocate-lap-register 'index)))
-
- (opcode :move (operand :cvar 'cache) cache)
- (opcode :move (operand :arg 'a0) t1)
- (opcode :std-instance-p t1 'std-instance)
- (opcode :go 'trap)
- (opcode :label 'std-instance)
- ;;
- ;; The stable register pattern for the rest of the code is:
- ;; cache Cache
- ;; count Cache count
- ;; wrapper Wrapper
- ;; index index under consideration
- ;;
- ;; Whenever we jump to HIT, this pattern must be in force.
- ;;
- (opcode :move (operand :std-wrapper t1) wrapper);
- (opcode :move (operand :cref cache 0) count) ;
- (opcode :move (operand :cref wrapper 0) i2) ;wrapper-no -> i2
- (opcode :move (operand :cvar 'mask) i1) ;mask -> i1
- (opcode :move (operand :ilogand i1 i2) index) ;
- ;
- (opcode :move (operand :iref cache index) t1) ;key -> t1
- (opcode :eq t1 wrapper 'hit) ;
- (opcode :move (operand :cvar 'reflect) i1) ;reflect -> i1
- (opcode :move index i2) ;index -> i2
- (opcode :move (operand :i- i1 i2) index) ;
- (opcode :move (operand :iref cache index) t1) ;key -> t1
- (opcode :eq t1 wrapper 'hit) ;
- (opcode :go 'trap)
-
- ;;
- ;; To do the hit, we use registers as follows:
- ;; 0 cache comes in here
- ;; 1 cache count comes in here
- ;; 2 use this for the function
- ;; 3 index comes in here
- ;; 4 1+ index, then new count
- ;;
- (opcode :label 'hit)
- (opcode :move (operand :i1+ index) i1)
- (opcode :move (operand :iref cache i1) t1)
-
- (opcode :move (operand :cref cache 0) t2)
- (opcode :fix= count t2 'call)
- (opcode :go 'trap)
-
- (opcode :label 'call)
- (opcode :jmp t1)
-
- (opcode :label 'trap)
- (opcode :move (operand :cvar 'trap) t1)
- (opcode :jmp t1)))))
-
- (funcall cg
- (make-array 16)
- (make-index-mask 16 2)
- (- 16 2)
- #'(lambda (a)
- (declare (pcl-fast-call) (ignore a))
- (break "Trap")))))
-
-